Please submit an HTML output of your R notebook that includes summary statistics and an explanation of the variables in the dataset. Also include an explanation of your model results.
library(data.table)
library(leaps)
Use the College dataset from the ISLR2 library and apply best subset selection, forward selection, and backward selection to predict the number of applications received (Apps) from the other variables.
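The selection task described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's final model: it assumes `Apps` is the response, that all 17 remaining columns are candidate predictors, and that BIC is an acceptable criterion for choosing the model size (adjusted R-squared or Mallows' Cp would be equally reasonable choices). The names `college`, `n_pred`, `pick_size`, and `sizes` are introduced here only for illustration.

```r
library(leaps)

# Assumption: Apps is the response; every other column is a candidate predictor.
college <- ISLR2::College
n_pred  <- ncol(college) - 1  # 17 candidate predictors (Private becomes one dummy)

# Fit the three search strategies. nvmax lifts regsubsets' default cap of 8 variables.
fit_best <- regsubsets(Apps ~ ., data = college, nvmax = n_pred)
fit_fwd  <- regsubsets(Apps ~ ., data = college, nvmax = n_pred, method = "forward")
fit_bwd  <- regsubsets(Apps ~ ., data = college, nvmax = n_pred, method = "backward")

# Hypothetical helper: model size with the lowest BIC for a given fit.
pick_size <- function(fit) which.min(summary(fit)$bic)

sizes <- sapply(list(best = fit_best, forward = fit_fwd, backward = fit_bwd), pick_size)
print(sizes)

# Coefficients of the BIC-selected best-subset model.
print(coef(fit_best, id = sizes[["best"]]))
```

Comparing `sizes` across the three strategies shows whether the greedy searches land on the same model size as the exhaustive search; the criterion values come from the full data here, so a validation split or cross-validation would be needed for an honest estimate of test error.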
dt <- data.table(ISLR2::College)
head(dt)
The ISLR2::College dataset contains “Statistics for a large number of US Colleges from the 1995 issue of US News and World Report.” For the definition of each variable, see https://rdocumentation.org/packages/ISLR2/versions/1.3-1/topics/College.
summary(dt)
 Private       Apps           Accept          Enroll        Top10perc       Top25perc      F.Undergrad
 No :212   Min.   :   81   Min.   :   72   Min.   :  35   Min.   : 1.00   Min.   :  9.0   Min.   :  139
 Yes:565   1st Qu.:  776   1st Qu.:  604   1st Qu.: 242   1st Qu.:15.00   1st Qu.: 41.0   1st Qu.:  992
           Median : 1558   Median : 1110   Median : 434   Median :23.00   Median : 54.0   Median : 1707
           Mean   : 3002   Mean   : 2019   Mean   : 780   Mean   :27.56   Mean   : 55.8   Mean   : 3700
           3rd Qu.: 3624   3rd Qu.: 2424   3rd Qu.: 902   3rd Qu.:35.00   3rd Qu.: 69.0   3rd Qu.: 4005
           Max.   :48094   Max.   :26330   Max.   :6392   Max.   :96.00   Max.   :100.0   Max.   :31643

   P.Undergrad       Outstate       Room.Board        Books          Personal          PhD           Terminal
 Min.   :    1.0   Min.   : 2340   Min.   :1780   Min.   :  96.0   Min.   : 250   Min.   :  8.00   Min.   : 24.0
 1st Qu.:   95.0   1st Qu.: 7320   1st Qu.:3597   1st Qu.: 470.0   1st Qu.: 850   1st Qu.: 62.00   1st Qu.: 71.0
 Median :  353.0   Median : 9990   Median :4200   Median : 500.0   Median :1200   Median : 75.00   Median : 82.0
 Mean   :  855.3   Mean   :10441   Mean   :4358   Mean   : 549.4   Mean   :1341   Mean   : 72.66   Mean   : 79.7
 3rd Qu.:  967.0   3rd Qu.:12925   3rd Qu.:5050   3rd Qu.: 600.0   3rd Qu.:1700   3rd Qu.: 85.00   3rd Qu.: 92.0
 Max.   :21836.0   Max.   :21700   Max.   :8124   Max.   :2340.0   Max.   :6800   Max.   :103.00   Max.   :100.0

   S.F.Ratio      perc.alumni       Expend         Grad.Rate
 Min.   : 2.50   Min.   : 0.00   Min.   : 3186   Min.   : 10.00
 1st Qu.:11.50   1st Qu.:13.00   1st Qu.: 6751   1st Qu.: 53.00
 Median :13.60   Median :21.00   Median : 8377   Median : 65.00
 Mean   :14.09   Mean   :22.74   Mean   : 9660   Mean   : 65.46
 3rd Qu.:16.50   3rd Qu.:31.00   3rd Qu.:10830   3rd Qu.: 78.00
 Max.   :39.80   Max.   :64.00   Max.   :56233   Max.   :118.00
df <- dewey::ifelsedata(data.frame(round(cor(dt[, !c("Private")]), 3)),
.85, "x >= y & x != 1", matchCols = FALSE)
rownames(df) <- colnames(df)
df
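For readers without the `dewey` package, a base R sketch of a similar check is shown below. It assumes the intent of the chunk above is to flag pairs of numeric variables whose rounded correlation is at least 0.85, excluding each variable's correlation with itself; that reading is an assumption, and this is not a reimplementation of `dewey::ifelsedata`. The names `college`, `cor_mat`, `idx`, and `high_cor` are illustrative.

```r
# Assumption: the goal is to list variable pairs with correlation >= 0.85.
college  <- ISLR2::College
num_cols <- sapply(college, is.numeric)          # drops the Private factor
cor_mat  <- round(cor(college[, num_cols]), 3)

# Look only at the upper triangle so each pair appears once and the
# diagonal (a variable with itself) is excluded.
idx <- which(cor_mat >= 0.85 & upper.tri(cor_mat), arr.ind = TRUE)

high_cor <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r    = cor_mat[idx]
)

# Strongest correlations first.
high_cor <- high_cor[order(-high_cor$r), ]
print(high_cor)
```

A long-format table like `high_cor` tends to be easier to scan than a full 17 x 17 matrix when the aim is to spot collinear predictors before running the selection methods.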
GGally::ggpairs(dt, mapping = ggplot2::aes(color = Private))
plot: [15,6] [========================================================================>-------------------] 80% est: 2s
plot: [15,7] [=========================================================================>------------------] 80% est: 2s
plot: [15,8] [=========================================================================>------------------] 80% est: 2s
plot: [15,9] [=========================================================================>------------------] 81% est: 2s
plot: [15,10] [=========================================================================>-----------------] 81% est: 2s
plot: [15,11] [=========================================================================>-----------------] 81% est: 2s
plot: [15,12] [=========================================================================>-----------------] 81% est: 2s
plot: [15,13] [=========================================================================>-----------------] 82% est: 2s
plot: [15,14] [==========================================================================>----------------] 82% est: 2s
plot: [15,15] [==========================================================================>----------------] 82% est: 2s
plot: [15,16] [==========================================================================>----------------] 83% est: 2s
plot: [15,17] [===========================================================================>---------------] 83% est: 2s
plot: [15,18] [===========================================================================>---------------] 83% est: 2s
plot: [16,1] [============================================================================>---------------] 84% est: 2s `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
plot: [16,2] [============================================================================>---------------] 84% est: 2s
plot: [16,3] [=============================================================================>--------------] 84% est: 2s
plot: [16,4] [=============================================================================>--------------] 85% est: 1s
plot: [16,5] [=============================================================================>--------------] 85% est: 1s
plot: [16,6] [=============================================================================>--------------] 85% est: 1s
plot: [16,7] [==============================================================================>-------------] 85% est: 1s
plot: [16,8] [==============================================================================>-------------] 86% est: 1s
plot: [16,9] [==============================================================================>-------------] 86% est: 1s
plot: [16,10] [==============================================================================>------------] 86% est: 1s
plot: [16,11] [==============================================================================>------------] 87% est: 1s
plot: [16,12] [==============================================================================>------------] 87% est: 1s
plot: [16,13] [==============================================================================>------------] 87% est: 1s
plot: [16,14] [===============================================================================>-----------] 88% est: 1s
plot: [16,15] [===============================================================================>-----------] 88% est: 1s
plot: [16,16] [===============================================================================>-----------] 88% est: 1s
plot: [16,17] [================================================================================>----------] 89% est: 1s
plot: [16,18] [================================================================================>----------] 89% est: 1s
plot: [17,1] [=================================================================================>----------] 89% est: 1s `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
plot: [17,2] [=================================================================================>----------] 90% est: 1s
plot: [17,3] [==================================================================================>---------] 90% est: 1s
plot: [17,4] [==================================================================================>---------] 90% est: 1s
plot: [17,5] [==================================================================================>---------] 90% est: 1s
plot: [17,6] [==================================================================================>---------] 91% est: 1s
plot: [17,7] [===================================================================================>--------] 91% est: 1s
plot: [17,8] [===================================================================================>--------] 91% est: 1s
plot: [17,9] [===================================================================================>--------] 92% est: 1s
plot: [17,10] [===================================================================================>-------] 92% est: 1s
plot: [17,11] [===================================================================================>-------] 92% est: 1s
plot: [17,12] [===================================================================================>-------] 93% est: 1s
plot: [17,13] [====================================================================================>------] 93% est: 1s
plot: [17,14] [====================================================================================>------] 93% est: 1s
plot: [17,15] [====================================================================================>------] 94% est: 1s
plot: [17,16] [====================================================================================>------] 94% est: 1s
plot: [17,17] [=====================================================================================>-----] 94% est: 1s
plot: [17,18] [=====================================================================================>-----] 94% est: 1s
plot: [18,1] [======================================================================================>-----] 95% est: 1s `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
plot: [18,2] [======================================================================================>-----] 95% est: 0s
plot: [18,3] [=======================================================================================>----] 95% est: 0s
plot: [18,4] [=======================================================================================>----] 96% est: 0s
plot: [18,5] [=======================================================================================>----] 96% est: 0s
plot: [18,6] [========================================================================================>---] 96% est: 0s
plot: [18,7] [========================================================================================>---] 97% est: 0s
plot: [18,8] [========================================================================================>---] 97% est: 0s
plot: [18,9] [========================================================================================>---] 97% est: 0s
plot: [18,10] [========================================================================================>--] 98% est: 0s
plot: [18,11] [========================================================================================>--] 98% est: 0s
plot: [18,12] [========================================================================================>--] 98% est: 0s
plot: [18,13] [=========================================================================================>-] 98% est: 0s
plot: [18,14] [=========================================================================================>-] 99% est: 0s
plot: [18,15] [=========================================================================================>-] 99% est: 0s
plot: [18,16] [=========================================================================================>-] 99% est: 0s
plot: [18,17] [==========================================================================================>]100% est: 0s
plot: [18,18] [===========================================================================================]100% est: 0s
These are the summary statistics. Nothing stands out in general, apart from possibly a few outliers. Few pairs of variables have a correlation of \(0.85\) or higher. Overall, this is about what I expected.
best_fit <- regsubsets(Apps ~ ., dt, nvmax = 17)
best_summary <- summary(best_fit)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"r2" = best_summary$adjr2)[order(r2 * -1, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
normal <- as.formula("Apps ~ + Private + Accept + Enroll + Top10perc + Top25perc + Outstate + Room.Board + PhD + Expend + Grad.Rate")
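The formula above lists ten predictors. As a minimal sketch, the variables in a selected model can also be read directly off a regsubsets fit instead of the plots. The object names `college`, `fit`, `fit_summary`, and `sizes` are illustrative, and using BIC to pick the size is an assumption, not necessarily how the formula above was chosen.

```r
library(leaps)

# Assumption: same data and call as in the notebook.
college <- ISLR2::College
fit <- regsubsets(Apps ~ ., data = college, nvmax = 17)
fit_summary <- summary(fit)

# Model size preferred by each criterion
# (lower is better for BIC and Cp, higher is better for adjusted R^2).
sizes <- c(bic   = unname(which.min(fit_summary$bic)),
           cp    = unname(which.min(fit_summary$cp)),
           adjr2 = unname(which.max(fit_summary$adjr2)))
sizes

# Coefficients (including the intercept) of the best model of the BIC-preferred size.
coef(fit, id = sizes[["bic"]])
```

The three criteria need not agree on a size; BIC penalizes extra variables most heavily, so it tends to prefer the smallest model.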
best_fit <- regsubsets(Apps ~ ., dt, nvmax = 17, method = "forward")
best_summary <- summary(best_fit)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"r2" = best_summary$adjr2)[order(r2 * -1, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
forward <- as.formula("Apps ~ + Private + Accept + Enroll + Top10perc + Top25perc + Outstate + Room.Board + PhD + Expend + Grad.Rate")
best_fit <- regsubsets(Apps ~ ., dt, nvmax = 17, method = "backward")
best_summary <- summary(best_fit)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"r2" = best_summary$adjr2)[order(r2 * -1, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
backward <- as.formula("Apps ~ + Private + Accept + Enroll + Top10perc + Top25perc + Outstate + Room.Board + PhD + Expend + Grad.Rate")
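The three formulas above are identical. A sketch of how that agreement could be checked in code rather than by eye is given here; `college`, `methods`, and `selected` are illustrative names, and the size of 10 is taken from the number of predictors in the formulas above.

```r
library(leaps)

college <- ISLR2::College
methods <- c("exhaustive", "forward", "backward")

# Variables in the 10-predictor model chosen by each search method.
selected <- lapply(methods, function(m) {
  fit <- regsubsets(Apps ~ ., data = college, nvmax = 17, method = m)
  sort(setdiff(names(coef(fit, id = 10)), "(Intercept)"))
})
names(selected) <- methods

# TRUE where a method picked the same 10 variables as the exhaustive search.
same_as_exhaustive <- sapply(selected, function(v) setequal(v, selected$exhaustive))
same_as_exhaustive
```

Forward and backward selection are greedy, so they are not guaranteed to match the exhaustive search at every model size; this check only covers one size.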
regs <- dewey::regsearch(dt, "Apps", colnames(dt[, !c("Apps")]), 1, 10, "gaussian", 0, FALSE, TRUE)
[1] "Assembling regresions..."
[1] "Creating 109293 formulas. Please be patient, this may take a while."
[1] "Creating regressions..."
[1] "Running 109293 regressions. Please be patient, this may take a while."
[1] "Running regressions..."
regs
dewey <- as.formula("Apps ~ + Accept + Top10perc")
# best subset, forward, and backward selection all produced the same formula, so only `normal` is refit here
forms <- c(normal, dewey)
lapply(forms, function(x) { summary(lm(formula = x, dt)) })
[[1]]
Call:
lm(formula = x, data = dt)
Residuals:
Min 1Q Median 3Q Max
-5085.2 -439.2 -27.4 315.6 7848.6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -100.51668 265.47592 -0.379 0.705069
PrivateYes -575.07061 132.52820 -4.339 1.62e-05 ***
Accept 1.58422 0.04011 39.500 < 2e-16 ***
Enroll -0.56221 0.11091 -5.069 5.02e-07 ***
Top10perc 49.13909 5.51638 8.908 < 2e-16 ***
Top25perc -13.86531 4.41751 -3.139 0.001762 **
Outstate -0.09466 0.01829 -5.176 2.89e-07 ***
Room.Board 0.16374 0.04668 3.508 0.000478 ***
PhD -10.01609 3.11921 -3.211 0.001378 **
Expend 0.07274 0.01142 6.370 3.26e-10 ***
Grad.Rate 7.33269 2.82114 2.599 0.009524 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1043 on 766 degrees of freedom
Multiple R-squared: 0.9283, Adjusted R-squared: 0.9274
F-statistic: 991.9 on 10 and 766 DF, p-value: < 2.2e-16
[[2]]
Call:
lm(formula = x, data = dt)
Residuals:
Min 1Q Median 3Q Max
-5334.2 -513.9 -16.7 325.1 9780.8
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -892.97561 77.89816 -11.46 <2e-16 ***
Accept 1.44004 0.01678 85.80 <2e-16 ***
Top10perc 35.83112 2.33210 15.36 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1125 on 774 degrees of freedom
Multiple R-squared: 0.9158, Adjusted R-squared: 0.9156
F-statistic: 4208 on 2 and 774 DF, p-value: < 2.2e-16
The 10-variable model from regsubsets fits slightly better (adjusted \(R^2\) of \(0.9274\) versus \(0.9156\); residual standard error of \(1043\) versus \(1125\)) but requires eight more variables. If data collection is not an issue, the regsubsets model is a good choice; if it is, my two-variable model is the better option. In my model, each additional accepted application is associated with about \(1.44\) more applications received, holding Top10perc constant. Each one-percentage-point increase in the share of new students from the top 10% of their high school class is associated with about \(35.83\) more applications received, holding Accept constant.
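Because the two-variable model is nested inside the 10-variable one, the trade-off can also be examined with a partial F-test and an information criterion. This is a sketch only; `college`, `small`, `large`, and `comparison` are illustrative names, and the formulas are copied from the ones fitted above.

```r
college <- ISLR2::College

# The two models compared in the text.
small <- lm(Apps ~ Accept + Top10perc, data = college)
large <- lm(Apps ~ Private + Accept + Enroll + Top10perc + Top25perc +
              Outstate + Room.Board + PhD + Expend + Grad.Rate,
            data = college)

# Partial F-test: do the eight extra predictors jointly improve the fit?
anova(small, large)

# Adjusted R^2 and BIC side by side; BIC penalizes the extra parameters.
comparison <- data.frame(
  model  = c("small", "large"),
  adj_r2 = c(summary(small)$adj.r.squared, summary(large)$adj.r.squared),
  bic    = c(BIC(small), BIC(large))
)
comparison
```

Both measures are computed on the same data used to fit the models, so they describe in-sample fit; a held-out split or cross-validation would be needed to compare predictive accuracy.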
Use the Boston data to predict the per capita crime rate using best subset selection, forward and backward selection methods.
dt <- data.table(ISLR2::Boston)
head(dt)
The ISLR2::Boston documentation describes the dataset as “A data set containing housing values in 506 suburbs of Boston.” If you want to learn more, I suggest visiting https://rdocumentation.org/packages/ISLR2/versions/1.3-1/topics/Boston.
summary(dt)
crim zn indus chas nox rm age
Min. : 0.00632 Min. : 0.00 Min. : 0.46 Min. :0.00000 Min. :0.3850 Min. :3.561 Min. : 2.90
1st Qu.: 0.08205 1st Qu.: 0.00 1st Qu.: 5.19 1st Qu.:0.00000 1st Qu.:0.4490 1st Qu.:5.886 1st Qu.: 45.02
Median : 0.25651 Median : 0.00 Median : 9.69 Median :0.00000 Median :0.5380 Median :6.208 Median : 77.50
Mean : 3.61352 Mean : 11.36 Mean :11.14 Mean :0.06917 Mean :0.5547 Mean :6.285 Mean : 68.57
3rd Qu.: 3.67708 3rd Qu.: 12.50 3rd Qu.:18.10 3rd Qu.:0.00000 3rd Qu.:0.6240 3rd Qu.:6.623 3rd Qu.: 94.08
Max. :88.97620 Max. :100.00 Max. :27.74 Max. :1.00000 Max. :0.8710 Max. :8.780 Max. :100.00
dis rad tax ptratio lstat medv
Min. : 1.130 Min. : 1.000 Min. :187.0 Min. :12.60 Min. : 1.73 Min. : 5.00
1st Qu.: 2.100 1st Qu.: 4.000 1st Qu.:279.0 1st Qu.:17.40 1st Qu.: 6.95 1st Qu.:17.02
Median : 3.207 Median : 5.000 Median :330.0 Median :19.05 Median :11.36 Median :21.20
Mean : 3.795 Mean : 9.549 Mean :408.2 Mean :18.46 Mean :12.65 Mean :22.53
3rd Qu.: 5.188 3rd Qu.:24.000 3rd Qu.:666.0 3rd Qu.:20.20 3rd Qu.:16.95 3rd Qu.:25.00
Max. :12.127 Max. :24.000 Max. :711.0 Max. :22.00 Max. :37.97 Max. :50.00
df <- dewey::ifelsedata(data.frame(round(cor(dt), 3)),
.85, "x >= y & x != 1", matchCols = FALSE)
rownames(df) <- colnames(df)
df
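For readers without the dewey package, a rough base R sketch of the same idea (listing variable pairs whose correlation is at least 0.85) follows. It is not the package's implementation; `boston`, `cors`, `high`, and `high_pairs` are illustrative names.

```r
boston <- ISLR2::Boston
cors <- round(cor(boston), 3)

# Keep each pair once (upper triangle only) and drop the diagonal of 1s.
cors[lower.tri(cors, diag = TRUE)] <- NA

# Row/column positions of the pairs at or above the threshold.
high <- which(cors >= 0.85, arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(cors)[high[, "row"]],
  var2 = colnames(cors)[high[, "col"]],
  cor  = cors[high]
)
high_pairs
```

Like the condition used above, this only flags strong positive correlations; comparing `abs(cors)` to the threshold would also catch strong negative ones.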
GGally::ggpairs(dt)
plot: [7,5] [=============================================>-----------------------------------------------] 49% est: 2s
plot: [7,6] [=============================================>-----------------------------------------------] 50% est: 2s
plot: [7,7] [==============================================>----------------------------------------------] 50% est: 2s
plot: [7,8] [==============================================>----------------------------------------------] 51% est: 2s
plot: [7,9] [===============================================>---------------------------------------------] 51% est: 1s
plot: [7,10] [===============================================>--------------------------------------------] 52% est: 1s
plot: [7,11] [===============================================>--------------------------------------------] 53% est: 1s
plot: [7,12] [================================================>-------------------------------------------] 53% est: 1s
plot: [7,13] [=================================================>------------------------------------------] 54% est: 1s
plot: [8,1] [==================================================>------------------------------------------] 54% est: 1s
plot: [8,2] [==================================================>------------------------------------------] 55% est: 1s
plot: [8,3] [===================================================>-----------------------------------------] 56% est: 1s
plot: [8,4] [===================================================>-----------------------------------------] 56% est: 1s
plot: [8,5] [====================================================>----------------------------------------] 57% est: 1s
plot: [8,6] [====================================================>----------------------------------------] 57% est: 1s
plot: [8,7] [=====================================================>---------------------------------------] 58% est: 1s
plot: [8,8] [=====================================================>---------------------------------------] 59% est: 1s
plot: [8,9] [======================================================>--------------------------------------] 59% est: 1s
plot: [8,10] [======================================================>-------------------------------------] 60% est: 1s
plot: [8,11] [=======================================================>------------------------------------] 60% est: 1s
plot: [8,12] [=======================================================>------------------------------------] 61% est: 1s
plot: [8,13] [========================================================>-----------------------------------] 62% est: 1s
plot: [9,1] [=========================================================>-----------------------------------] 62% est: 1s
plot: [9,2] [=========================================================>-----------------------------------] 63% est: 1s
plot: [9,3] [==========================================================>----------------------------------] 63% est: 1s
plot: [9,4] [==========================================================>----------------------------------] 64% est: 1s
plot: [9,5] [===========================================================>---------------------------------] 64% est: 1s
plot: [9,6] [============================================================>--------------------------------] 65% est: 1s
plot: [9,7] [============================================================>--------------------------------] 66% est: 1s
plot: [9,8] [=============================================================>-------------------------------] 66% est: 1s
plot: [9,9] [=============================================================>-------------------------------] 67% est: 1s
plot: [9,10] [=============================================================>------------------------------] 67% est: 1s
plot: [9,11] [==============================================================>-----------------------------] 68% est: 1s
plot: [9,12] [==============================================================>-----------------------------] 69% est: 1s
plot: [9,13] [===============================================================>----------------------------] 69% est: 1s
plot: [10,1] [===============================================================>----------------------------] 70% est: 1s
plot: [10,2] [================================================================>---------------------------] 70% est: 1s
plot: [10,3] [================================================================>---------------------------] 71% est: 1s
plot: [10,4] [=================================================================>--------------------------] 72% est: 1s
plot: [10,5] [=================================================================>--------------------------] 72% est: 1s
plot: [10,6] [==================================================================>-------------------------] 73% est: 1s
plot: [10,7] [===================================================================>------------------------] 73% est: 1s
plot: [10,8] [===================================================================>------------------------] 74% est: 1s
plot: [10,9] [====================================================================>-----------------------] 75% est: 1s
plot: [10,10] [===================================================================>-----------------------] 75% est: 1s
plot: [10,11] [====================================================================>----------------------] 76% est: 1s
plot: [10,12] [====================================================================>----------------------] 76% est: 1s
plot: [10,13] [=====================================================================>---------------------] 77% est: 1s
plot: [11,1] [======================================================================>---------------------] 78% est: 1s
plot: [11,2] [=======================================================================>--------------------] 78% est: 1s
plot: [11,3] [=======================================================================>--------------------] 79% est: 1s
plot: [11,4] [========================================================================>-------------------] 79% est: 1s
plot: [11,5] [========================================================================>-------------------] 80% est: 1s
plot: [11,6] [=========================================================================>------------------] 80% est: 1s
plot: [11,7] [==========================================================================>-----------------] 81% est: 1s
plot: [11,8] [==========================================================================>-----------------] 82% est: 1s
plot: [11,9] [===========================================================================>----------------] 82% est: 1s
plot: [11,10] [==========================================================================>----------------] 83% est: 1s
plot: [11,11] [===========================================================================>---------------] 83% est: 0s
plot: [11,12] [===========================================================================>---------------] 84% est: 0s
plot: [11,13] [============================================================================>--------------] 85% est: 0s
plot: [12,1] [=============================================================================>--------------] 85% est: 0s
plot: [12,2] [==============================================================================>-------------] 86% est: 0s
plot: [12,3] [==============================================================================>-------------] 86% est: 0s
plot: [12,4] [===============================================================================>------------] 87% est: 0s
plot: [12,5] [================================================================================>-----------] 88% est: 0s
plot: [12,6] [================================================================================>-----------] 88% est: 0s
plot: [12,7] [=================================================================================>----------] 89% est: 0s
plot: [12,8] [=================================================================================>----------] 89% est: 0s
plot: [12,9] [==================================================================================>---------] 90% est: 0s
plot: [12,10] [=================================================================================>---------] 91% est: 0s
plot: [12,11] [==================================================================================>--------] 91% est: 0s
plot: [12,12] [==================================================================================>--------] 92% est: 0s
plot: [12,13] [===================================================================================>-------] 92% est: 0s
plot: [13,1] [====================================================================================>-------] 93% est: 0s
plot: [13,2] [=====================================================================================>------] 93% est: 0s
plot: [13,3] [======================================================================================>-----] 94% est: 0s
plot: [13,4] [======================================================================================>-----] 95% est: 0s
plot: [13,5] [=======================================================================================>----] 95% est: 0s
plot: [13,6] [=======================================================================================>----] 96% est: 0s
plot: [13,7] [========================================================================================>---] 96% est: 0s
plot: [13,8] [========================================================================================>---] 97% est: 0s
plot: [13,9] [=========================================================================================>--] 98% est: 0s
plot: [13,10] [========================================================================================>--] 98% est: 0s
plot: [13,11] [=========================================================================================>-] 99% est: 0s
plot: [13,12] [=========================================================================================>-] 99% est: 0s
plot: [13,13] [===========================================================================================]100% est: 0s
Nothing in these numbers looks unusual. It is a little surprising that tax and rad are the only pair of variables correlated above \(85\%\); I don't have a good explanation, though highway access may be tied to property-tax rates in some way.
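The "correlated above 85%" check can also be done numerically instead of by reading the pairs plot. The sketch below is only an illustration: `high_cor_pairs` is a hypothetical helper name, the toy data frame is made up, and it assumes every column is numeric.

```r
# Hypothetical helper: list variable pairs whose absolute Pearson
# correlation exceeds a cutoff (assumes all columns are numeric).
high_cor_pairs <- function(df, cutoff = 0.85) {
  cm <- cor(df)                                   # full correlation matrix
  # keep the upper triangle only so each pair is reported once
  idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
  data.frame(var1 = rownames(cm)[idx[, "row"]],
             var2 = colnames(cm)[idx[, "col"]],
             r    = cm[idx],
             stringsAsFactors = FALSE)
}

# Toy data for illustration only (not the notebook's data):
# b is a noisy copy of a, c is independent noise.
set.seed(1)
x <- rnorm(100)
toy <- data.frame(a = x, b = x + rnorm(100, sd = 0.1), c = rnorm(100))
pairs_found <- high_cor_pairs(toy)
print(pairs_found)
```

On the notebook's data the call would presumably be `high_cor_pairs(dt)`, assuming `dt` holds only numeric columns.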
best_fit <- regsubsets(crim ~ ., dt, nvmax = 12)
best_summary <- summary(best_fit)
# One row per model size; best models first (highest adjusted R^2, then lowest BIC and Cp)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"adj_r2" = best_summary$adjr2)[order(-adj_r2, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
normal <- as.formula("crim ~ zn + nox + dis + rad + ptratio + lstat + medv")
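The formula above is typed out by hand. As a possible alternative, the selected predictors can be pulled out of the regsubsets fit programmatically. This is a minimal sketch on the built-in mtcars data rather than the notebook's data, and picking the size with the lowest Cp is an assumption about the criterion, not necessarily how the formula above was chosen.

```r
library(leaps)

# Illustration on built-in data (mtcars), not the notebook's dataset
fit <- regsubsets(mpg ~ ., data = mtcars, nvmax = 10)
s <- summary(fit)

best_size <- which.min(s$cp)               # model size with the lowest Mallows' Cp
chosen <- names(coef(fit, best_size))[-1]  # predictor names, dropping "(Intercept)"

# Build a formula from the chosen predictors
f <- reformulate(chosen, response = "mpg")
print(f)
```

Swapping `s$cp` for `s$bic`, or `which.min` for `which.max(s$adjr2)`, selects by a different criterion. Names taken from `coef()` only match column names directly when all predictors are numeric, as they are here.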
best_fit <- regsubsets(crim ~ ., dt, nvmax = 12, method = "forward")
best_summary <- summary(best_fit)
# One row per model size; best models first (highest adjusted R^2, then lowest BIC and Cp)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"adj_r2" = best_summary$adjr2)[order(-adj_r2, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
forward <- as.formula("crim ~ zn + nox + rm + dis + rad + ptratio + lstat + medv")
best_fit <- regsubsets(crim ~ ., dt, nvmax = 12, method = "backward")
best_summary <- summary(best_fit)
# One row per model size; best models first (highest adjusted R^2, then lowest BIC and Cp)
data.table("BIC" = best_summary$bic,
"Cp" = best_summary$cp,
"adj_r2" = best_summary$adjr2)[order(-adj_r2, BIC, Cp)]
par(mfrow = c(1,2))
plot(best_summary$cp, xlab = "number of features", ylab = "cp")
plot(best_fit, scale = "Cp")
par(mfrow = c(1, 2))
plot(best_summary$bic, xlab = "number of features", ylab = "bic")
plot(best_fit, scale = "bic")
backward <- as.formula("crim ~ zn + nox + dis + rad + ptratio + lstat + medv")
regs <- dewey::regsearch(dt, "crim", c(colnames(dt[, !c("crim")]), "lstat*rad"), 1, 12, "gaussian", 0, FALSE, TRUE)
[1] "Assembling regresions..."
[1] "Creating 8190 formulas. Please be patient, this may take a while."
[1] "Creating regressions..."
[1] "Running 5119 regressions. Please be patient, this may take a while."
[1] "Running regressions..."
regs
dewey <- as.formula("crim ~ lstat + rad")
# Best subset selection (normal) and backward selection chose the same predictors,
# so backward is dropped from the comparison.
forms <- c(normal, forward, dewey)
lapply(forms, function(x) { summary(lm(formula = x, dt)) })
[[1]]
Call:
lm(formula = x, data = dt)
Residuals:
Min 1Q Median 3Q Max
-8.655 -2.143 -0.319 1.050 74.740
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.46682 6.02424 2.899 0.003904 **
zn 0.04497 0.01803 2.494 0.012951 *
nox -12.45782 4.77637 -2.608 0.009375 **
dis -0.94255 0.26270 -3.588 0.000366 ***
rad 0.56152 0.04813 11.667 < 2e-16 ***
ptratio -0.34703 0.18288 -1.898 0.058322 .
lstat 0.11479 0.06945 1.653 0.098997 .
medv -0.19026 0.05369 -3.543 0.000432 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.452 on 498 degrees of freedom
Multiple R-squared: 0.4452, Adjusted R-squared: 0.4374
F-statistic: 57.08 on 7 and 498 DF, p-value: < 2.2e-16
[[2]]
Call:
lm(formula = x, data = dt)
Residuals:
Min 1Q Median 3Q Max
-8.724 -2.181 -0.288 1.081 73.947
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 13.25994 6.97229 1.902 0.057774 .
zn 0.04332 0.01807 2.397 0.016906 *
nox -12.50587 4.77446 -2.619 0.009080 **
rm 0.70899 0.59233 1.197 0.231896
dis -0.93005 0.26279 -3.539 0.000439 ***
rad 0.55378 0.04854 11.409 < 2e-16 ***
ptratio -0.33823 0.18294 -1.849 0.065079 .
lstat 0.13578 0.07160 1.896 0.058495 .
medv -0.21711 0.05817 -3.732 0.000212 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.449 on 497 degrees of freedom
Multiple R-squared: 0.4467, Adjusted R-squared: 0.4378
F-statistic: 50.17 on 8 and 497 DF, p-value: < 2.2e-16
[[3]]
Call:
lm(formula = x, data = dt)
Residuals:
Min 1Q Median 3Q Max
-8.953 -1.881 -0.249 1.040 76.726
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -4.38141 0.59872 -7.318 1.00e-12 ***
lstat 0.23728 0.04685 5.065 5.75e-07 ***
rad 0.52281 0.03842 13.607 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.559 on 503 degrees of freedom
Multiple R-squared: 0.4208, Adjusted R-squared: 0.4185
F-statistic: 182.7 on 2 and 503 DF, p-value: < 2.2e-16
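Again, regsubsets produces slightly better models (adjusted R-squared of \(0.437\) and \(0.438\)), but mine is almost as good (\(0.419\)) and is far more parsimonious, with two predictors instead of seven or eight. Holding rad fixed, a one-unit increase in lstat is associated with an increase of \(0.237\) in crim; holding lstat fixed, a one-unit increase in rad is associated with an increase of \(0.523\) in crim.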
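To make the coefficient interpretation concrete, here is a small arithmetic sketch using the estimates printed in the lstat + rad model output. The function name `predict_crim` and the example values lstat = 10 and rad = 5 are made up for illustration.

```r
# Coefficients copied from the lstat + rad model summary above
b <- c(intercept = -4.38141, lstat = 0.23728, rad = 0.52281)

# Hypothetical helper: fitted crim for given lstat and rad
predict_crim <- function(lstat, rad) {
  unname(b["intercept"] + b["lstat"] * lstat + b["rad"] * rad)
}

# Illustrative baseline (values chosen arbitrarily)
base <- predict_crim(lstat = 10, rad = 5)

# Raise one predictor by one unit while holding the other fixed
d_lstat <- predict_crim(lstat = 11, rad = 5) - base
d_rad   <- predict_crim(lstat = 10, rad = 6) - base

print(c(d_lstat = d_lstat, d_rad = d_rad))
```

Because the model is linear with no interaction term, these differences equal the coefficients themselves regardless of the baseline chosen.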